1. Summary of Accomplishments:


As part of this project, we explored risk-based approaches for securely managing data in the cloud. Inspired by how living organisms manage risk in nature while conserving energy, we developed novel risk-based approaches that balance risk (i.e., the risk of sensitive data disclosure), computational cost (i.e., how long it takes to run the security-enhanced tasks in the cloud), and monetary cost (i.e., how much more is paid to the cloud service provider because of security-enhanced cloud computing). In different settings, we solve variants of the following multi-objective optimization framework, in which we find the best query execution plan Q among all possible query plans, namely the plan that minimizes the total run time without exceeding the predefined monetary cost and risk bounds.

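$$
\begin{aligned}
\underset{Q \in \mathcal{Q}}{\text{minimize}} \quad & \mathrm{RunTime}(Q) \\
\text{subject to} \quad & (1)\ \mathrm{MonetaryCost}(Q) \le M \\
& (2)\ \mathrm{Risk}(Q) \le R
\end{aligned}
$$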
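The selection rule in this formulation can be sketched as a filter-then-minimize step. The Python sketch below assumes that every candidate plan already carries an estimated run time, monetary cost, and risk. The `Plan` type, the `best_plan` function, and all numbers are hypothetical illustrations, not part of the project's software.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Plan:
    """A candidate query execution plan with pre-estimated metrics (hypothetical type)."""
    name: str
    run_time: float       # estimated RunTime(Q)
    monetary_cost: float  # estimated MonetaryCost(Q)
    risk: float           # estimated Risk(Q)


def best_plan(plans: Iterable[Plan], max_cost: float, max_risk: float) -> Optional[Plan]:
    """Return the fastest plan whose cost and risk stay within M and R, or None."""
    feasible = [p for p in plans if p.monetary_cost <= max_cost and p.risk <= max_risk]
    return min(feasible, key=lambda p: p.run_time, default=None)


# Made-up candidates: fully public, hybrid, and fully private execution.
candidates = [
    Plan("all-public", run_time=10.0, monetary_cost=90.0, risk=1.0),
    Plan("hybrid", run_time=14.0, monetary_cost=40.0, risk=0.3),
    Plan("all-private", run_time=30.0, monetary_cost=0.0, risk=0.0),
]
chosen = best_plan(candidates, max_cost=50.0, max_risk=0.5)
print(chosen.name if chosen else "no feasible plan")
```

In practice the set of plans is far too large to enumerate, which is why the settings described in this report rely on problem-specific search techniques rather than a plain scan.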


To our knowledge, and as also reported in Network World Magazine¹, this is the first framework that integrates rigorous risk management tools “...that meets the conflicting goals of performance, sensitive data disclosure risk and resource allocation costs getting weighed and balanced.” The framework proposed as a part of this proposal resulted in numerous publications in top security, data management, and cloud computing venues and has already received hundreds of citations according to Google Scholar. We illustrate the general applicability of the framework by briefly discussing how it is applied in two very different application settings. In addition to the following examples and contributions, we developed the first encrypted key-value store that supports efficient search and access control capabilities for hybrid clouds.

2.  Risk-based  Query  Processing  in  Hybrid  Clouds  [3] 

An emerging trend in cloud computing is the hybrid cloud. Unlike traditional outsourcing, where organizations push their data and data processing to the cloud, a hybrid cloud seamlessly integrates in-house capabilities and resources at the end-user site with cloud services to create a powerful yet cost-effective data processing solution. Hybrid cloud solutions offer benefits similar to those of traditional cloud solutions. In addition, they provide advantages in disclosure control and in minimizing cloud resources, given that most organizations already have an infrastructure they can use. Exploiting these benefits, however, raises numerous questions, the foremost of which is how to split the data and computation between the public and private sides of the infrastructure. Different choices have different implications for sensitive data disclosure, computational performance, and resource allocation costs.

At one extreme, one may choose to outsource the entire data and workload to the public cloud (as is typical of outsourcing solutions). While simple to implement, such a solution incurs the highest resource allocation cost in terms of cloud services (both storage and computing) and is the most vulnerable to data leakage. In addition, the outsourcing strategy may not even be optimal in terms of performance, since it leaves local resources unused. An alternative strategy is to replicate the data on both the private and public sides and to split the workload between the two: simple queries are computed on the private side, and complex ones are run on the public infrastructure. This strategy exploits local resources and thereby reduces the cost of the required cloud services. However, because the data is fully replicated, the resource allocation cost and the amount of sensitive data exposed to the public cloud will be at their maximum in this case. A third possibility is to replicate only part of the data to the public side, which enables the computation to be distributed while keeping the disclosure risks and resource allocation costs within the desired thresholds. These three possibilities are only a few of the many computation partitioning choices. The third option appears to be the best with respect to end-user requirements such as performance, cost, and sensitive data exposure. As different variants of the computation partitioning problem are formulated, a myriad of design choices present themselves. These choices depend on the data and workload formats (dynamic queries or batch jobs), as well as on the query execution techniques used over hybrid clouds.

1 Christine Burns Rudalevige, "Hybrid clouds pose new security challenges", Network World, http://www.networkworld.com/article/2163059/cloud-computing/hybrid-clouds-pose-new-security-challenges.html


In this specific work, we formalized our generic risk management framework for the computation (and the implied data) partitioning problem for hybrid clouds and developed a framework for splitting data processing tasks such that the desired goals of performance, disclosure risk, and monetary expense are achieved. In particular, given a workload of jobs (specifically, SQL-style Hive queries), the underlying dataset (assumed to be relational), and the machine characteristics of the private and public clouds, we proposed a dynamic programming approach to solve the computation partitioning problem.

Figure 1: Proposed Architecture
[Figure labels: User Interface Layer; Statistics Gathering Layer; Data and Query Management Layer; Relations R; Queries Q; Constraints C; Rpub, Qpub (Public); R, Qpriv (Private); Results for Qpub; Results for Qpriv]


2 Here, the public cloud can be regarded as a larger, untrusted cloud infrastructure, and the private cloud as a small but trusted infrastructure. The proposed solution can therefore be used to reduce the trust placed in a given infrastructure.

2.1. Proposed Architecture

An overview of our proposed system architecture is given in Figure 1. The system consists mainly of two components: the Statistics Gathering layer collects statistics over the dataset and the query jobs, while the Data and Query Management layer decides on the data and workload partitioning for the given set of queries. Our focus in this work was on the Data and Query Management layer, although, as will become clear, statistics gathering is essential for determining the optimal query workload and data distribution.

A user starts by submitting a set of relations, R = {R1, R2, ..., Rm}, a query workload, Q = {q1, q2, ..., qn}, and a set of resource allocation and sensitive data disclosure constraints, C. The system first collects statistics over R and Q using the statistics gathering module. This module estimates the minimum set of data items and the I/O sizes (alternatively, the running time) of the base relations required to answer each query in Q. Additionally, the statistics SR are created as equi-width histograms and sent to the estimator modules. The computation partitioning module receives R, Q, and C, as well as the estimated I/O sizes and the minimum required set of data items for each query in Q, and then systematically solves the computation partitioning problem, CPP. In solving CPP, our algorithm uses the monetary cost estimator to estimate the monetary cost of processing public cloud queries and of storing intermediate public-side data partitions, whereas it uses the disclosure risk estimator to compute the amount of sensitive data that a candidate solution includes. On solving CPP, this layer produces two outputs: Rpub, the public cloud portion of R, and Qpub, the set of queries that will be executed on the public cloud. The private cloud stores the entire dataset R, whereas the public cloud maintains only the public-side data partition, Rpub. The non-sensitive and sensitive data in Rpub and R are stored using an appropriate representation technique on the public and private clouds, respectively. Once the system has stored the data according to the solution to CPP, it is ready to support query processing.
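To make the statistics step more concrete, the following Python sketch builds an equi-width histogram over one numeric column. It is only an illustration of the histogram type named above. The function name and the sample column are hypothetical, and how the actual estimator modules consume the histograms is not shown.

```python
from typing import List, Sequence


def equi_width_histogram(values: Sequence[float], num_buckets: int) -> List[int]:
    """Count values per bucket, where every bucket spans an equal value range."""
    if num_buckets <= 0:
        raise ValueError("num_buckets must be positive")
    if not values:
        return [0] * num_buckets
    lo, hi = min(values), max(values)
    # Fall back to width 1.0 when all values are identical (zero range).
    width = (hi - lo) / num_buckets or 1.0
    counts = [0] * num_buckets
    for v in values:
        # Clamp so that the maximum value lands in the last bucket.
        idx = min(int((v - lo) / width), num_buckets - 1)
        counts[idx] += 1
    return counts


# Hypothetical column values, e.g. a quantity attribute of a relation.
column = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
print(equi_width_histogram(column, num_buckets=3))
```

Bucket counts of this kind let an estimator approximate how many tuples a range predicate touches without scanning the relation.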

2.2. Modification of Generic Framework for Hybrid Cloud Setting

Let sens(R′) be the estimated number of sensitive cells in dataset R′; baseTables(q) be the estimated minimum set of data items necessary to answer query q ∈ Q; runT_x(q) be the estimated running time of query q ∈ Q at site x (either public or private); ORunT(Q′, Q″) be the overall execution time of the queries in Q′, given that the queries in Q″ are executed on the public cloud; freq(q) be the frequency of running query q; MC be the defined monetary constraint; DC be the defined upper bound on sensitive data disclosure, measured as the number of sensitive items outsourced to the cloud³; stor(Rpub) be the monetary cost of storing the public cloud partition; and proc(q) be the monetary cost of processing a public-side query q. Then we can rewrite our generic formulation as follows:

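$$
\begin{aligned}
\text{minimize} \quad & \mathrm{ORunT}(Q, Q_{pub}) \\
\text{subject to} \quad & (1)\ \mathrm{stor}(R_{pub}) + \sum_{q \in Q_{pub}} \mathrm{freq}(q) \times \mathrm{proc}(q) \le MC \\
& (2)\ \mathrm{sens}(R_{pub}) \le DC \\
& (3)\ \forall q \in Q_{pub}:\ \mathrm{baseTables}(q) \subseteq R_{pub}
\end{aligned}
$$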

3  Our  framework  could  use  any  other  sensitive  data  disclosure  risk  measure  as  well. 


DISTRIBUTION  A:  Distribution  approved  for  public  release. 


In our work, we showed that the above optimization problem can be solved using dynamic programming to find an optimal workload partitioning that balances risk, monetary cost, and run time.
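The structure of this optimization problem can be illustrated on a toy workload. The Python sketch below is not the dynamic programming algorithm from [3]. It is a plain exhaustive search over all choices of Qpub, usable only for a handful of queries. It assumes that the two sides run in parallel, so ORunT is taken as the total time of the slower side. It also assumes that Rpub is exactly the union of the base tables needed by Qpub, and that sens and stor are given per table. All names and numbers are hypothetical.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import Dict, FrozenSet, List, Optional, Tuple


@dataclass(frozen=True)
class Query:
    name: str
    base_tables: FrozenSet[str]  # baseTables(q)
    run_private: float           # runT_private(q)
    run_public: float            # runT_public(q)
    freq: int                    # freq(q)
    proc_cost: float             # proc(q)


def overall_run_time(queries: List[Query], public: FrozenSet[str]) -> float:
    """ASSUMPTION: both sides work in parallel, so ORunT is the slower side's total."""
    pub = sum(q.freq * q.run_public for q in queries if q.name in public)
    priv = sum(q.freq * q.run_private for q in queries if q.name not in public)
    return max(pub, priv)


def solve_cpp(
    queries: List[Query],
    sens: Dict[str, int],
    stor: Dict[str, float],
    mc: float,
    dc: int,
) -> Optional[Tuple[float, FrozenSet[str], FrozenSet[str]]]:
    """Try every Q_pub and keep the fastest one satisfying constraints (1)-(3)."""
    best = None
    for size in range(len(queries) + 1):
        for subset in combinations(queries, size):
            # Constraint (3): ship exactly the tables that the public queries need.
            r_pub = frozenset().union(*(q.base_tables for q in subset))
            cost = sum(stor[t] for t in r_pub) + sum(q.freq * q.proc_cost for q in subset)
            exposure = sum(sens[t] for t in r_pub)
            if cost > mc or exposure > dc:  # constraints (1) and (2)
                continue
            q_pub = frozenset(q.name for q in subset)
            t = overall_run_time(queries, q_pub)
            if best is None or t < best[0]:
                best = (t, q_pub, r_pub)
    return best


# Hypothetical workload over two tables; only lineitem holds sensitive cells.
sens = {"lineitem": 100, "orders": 0}
stor = {"lineitem": 5.0, "orders": 2.0}
queries = [
    Query("q1", frozenset({"orders"}), 10.0, 3.0, 1, 1.0),
    Query("q2", frozenset({"lineitem"}), 20.0, 6.0, 1, 2.0),
    Query("q3", frozenset({"orders"}), 8.0, 2.0, 1, 1.0),
]
print(solve_cpp(queries, sens, stor, mc=10.0, dc=0))
print(solve_cpp(queries, sens, stor, mc=10.0, dc=100))
```

Raising the disclosure bound in the second call lets the search move the query over the sensitive table to the public side, which shortens the overall run time. This mirrors the trade-off between risk and performance that the framework is designed to expose.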


2.3. Overview of the Experimental Results

Using the existing TPC-H benchmark and realistic cloud settings inspired by Amazon prices, we ran experiments in which the public cloud is at least 3 times more powerful than the private cloud. In our experiments, the resource allocation cost was varied between 25% and 50% of the total maximum value defined by the user. We defined four overall sensitivity levels: No-Sensitivity (the entire dataset is non-sensitive), 1%-Sensitivity, 5%-Sensitivity, and 10%-Sensitivity (1%, 5%, and 10% of the tuples of the lineitem table used in the TPC-H benchmark are made sensitive). We defined seven sensitive data exposure levels: 0% (none of the sensitive data is exposed), 10%, 25%, 40%, 50%, 75%, and 100% (all of the defined sensitive data may be exposed). We then computed the overall performance of the query workload for different combinations of these three parameters; the results are presented in Figure 2. One of the first observations that can be made from Figure 2 is that when a user is willing to take additional risk by storing more sensitive data on the public side, they can gain a considerable speed-up in overall execution time.

Figure 2: Overview of Results


Figure 2 also shows that when a user invests more capital in resource allocation, a considerable gain in overall workload performance (even greater than 50%) can be achieved. This is expected: when more resources are allocated on the public side, we are better able to exploit the parallelism afforded by a hybrid cloud. Thus, the intuition that a hybrid cloud improves performance through greater use of inherent parallelism is justified. Finally, Figure 2 also shows that a considerable improvement in query performance (~50%) can be achieved for a relatively low risk (~40%) and resource allocation cost (~50%).

3.  Managing  Sensitive  Encryption  Key  Exposure  Risks  in  Public  Clouds  [7] 

Despite its numerous advantages, cloud computing also introduces new challenges and concerns, primarily security and privacy risks. These concerns stem from outsourcing critical data (e.g., health records, social security numbers, or even cryptographic keys) and/or computing capabilities to a distant computing environment, where the resources are shared with other, potentially untrusted customers. In particular, to increase efficiency and reduce costs, a cloud service provider (CSP) may place multiple virtual machines (VMs) belonging to different customers on the same physical machine. On such an execution platform, VMs should be logically isolated from each other to protect the privacy of each client. CSPs use virtual machine monitors (VMMs) to realize logical isolation among VMs running on the same physical machine. However, work that specifically targets public cloud infrastructures has shown that a clever adversary can perform cross-VM side-channel attacks (for brevity, cross-VM attacks) to learn private information that resides in another VM, even under carefully enforced logical isolation. Initially, Ristenpart et al.⁴ presented heuristics that improve an adversary's ability to place its VMs alongside the victim VMs and to learn crude information (e.g., aggregate cache usage). Even worse, Zhang et al.⁵ managed to extract ElGamal decryption keys through cross-VM attacks.

These works demonstrate that logical isolation and a trustworthy cloud provider are not necessarily enough to guarantee the security of sensitive information. It would be too optimistic to assume that an adversary is limited to the two aforementioned attacks, because a wide variety of side-channel attacks exists, each with its own setup and methodology. The absence of such attacks on public cloud infrastructures so far does not necessarily mean that they are inapplicable.

To  this  end,  we  developed  HERMES,  a  system  that  remedies  the  cryptographic  key  disclosure 
vulnerabilities  of  VMs  in  the  public  cloud  by  using  well-established  cryptographic  tools  such  as 
Secret  Sharing  and  Threshold  Cryptography.  Specifically,  the  key  technique  in  our  system  is  to 
partition  a  cryptographic  key  into  several  pieces,  which  are  computed  using  threshold 
cryptosystems,  and  to  store  each  share  on  a  different  VM.  This  makes  it  harder  for  an  adversary 
to  capture  the  complete  cryptographic  key  itself,  since  it  now  has  to  extract  shares  from  multiple 
VMs  (note  that  there  is  no  single  key  or  a  centralized  key  anymore  in  HERMES).  To  further 
improve  the  resilience,  the  same  cryptographic  key  is  re-shared  periodically,  so  that  a  share  is 
meaningful  in  only  one  time  period/epoch.  Consequently,  we  introduce  two  significant 
challenges  against  a  successful  attack:  (i)  Multiple  VMs  should  be  attacked,  and  (ii)  each  attack 
should  succeed  within  a  certain  time  period. 
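The share-and-re-share idea can be illustrated with textbook Shamir secret sharing. The Python sketch below is only an illustration under simplifying assumptions. A single trusted dealer holds the key and re-splits it with fresh randomness in every epoch, and the key is reconstructed here only to check the sketch. HERMES itself relies on threshold cryptosystems so that no single machine holds the complete key. The field prime, function names, and sample key are hypothetical choices.

```python
import secrets
from typing import List, Tuple

PRIME = 2**127 - 1  # a Mersenne prime; secrets must be smaller than this
Share = Tuple[int, int]  # (x, y) point on the sharing polynomial


def split(secret: int, k: int, l: int) -> List[Share]:
    """Split `secret` into l shares such that any k of them reconstruct it."""
    if not (1 <= k <= l < PRIME) or not (0 <= secret < PRIME):
        raise ValueError("need 1 <= k <= l and 0 <= secret < PRIME")
    # Random polynomial of degree k-1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(k - 1)]

    def evaluate(x: int) -> int:
        acc = 0
        for c in reversed(coeffs):  # Horner's rule
            acc = (acc * x + c) % PRIME
        return acc

    return [(x, evaluate(x)) for x in range(1, l + 1)]


def reconstruct(shares: List[Share]) -> int:
    """Lagrange interpolation at x = 0 over the field of integers modulo PRIME."""
    secret = 0
    for j, (xj, yj) in enumerate(shares):
        num, den = 1, 1
        for m, (xm, _) in enumerate(shares):
            if m != j:
                num = (num * -xm) % PRIME
                den = (den * (xj - xm)) % PRIME
        secret = (secret + yj * num * pow(den, -1, PRIME)) % PRIME
    return secret


key = 123456789  # stand-in for a private key
for epoch in range(2):
    shares = split(key, k=3, l=5)  # fresh, independent shares in every epoch
    assert reconstruct(shares[:3]) == key
    print(f"epoch {epoch}: key recovered from 3 of 5 shares")
```

Because every epoch draws a new random polynomial, shares captured in one epoch are of no use in combination with shares from another, which is the property behind challenge (ii).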

Using our generic model, we formalize the problem of finding good HERMES configurations (e.g., how many shares to create for each key, and how many shares are needed to reconstruct the secret) that minimize the security risk for given monetary and performance constraints.

3.1.  Risk-Aware  Parameter  Setting  Mechanism  for  Protecting  Sensitive  Keys  In  The 
Clouds 

In our formalization, we consider three main aspects: security, cost, and performance. The security aspect allows us to provide an upper bound on the probability of a successful key extraction attack on HERMES for given values of k (the number of shares needed for correct decryption using the protected private key), l (the total number of shares of the private key), and τ (the re-sharing period, i.e., the time after which the secret key is re-created and re-shared). Theoretically, increasing k and l or decreasing τ makes it harder for the adversary to achieve its goal of recovering keys. However, increasing l implies more defender VMs running in the cloud, which increases the total cost. Moreover, our experiments showed that performance degrades as l and k increase together. Hence, the values of k, l, and τ should be chosen optimally for the given constraints (e.g., budget, performance limit).

4 Ristenpart, T., Tromer, E., Shacham, H., and Savage, S. Hey, you, get off of my cloud: exploring information leakage in third-


Measuring Security: To quantify the probability of a successful attack in an epoch, we assume that the adversary has to start from scratch in each epoch, which implies that it loses all previously acquired information. This is a valid assumption, since the shares of each epoch are independent of one another, and a captured share does not contribute any information to the next epoch. Because acquired information cannot be carried over to subsequent epochs, it is reasonable to model the time until a successful attack as an exponentially distributed random variable. Given the success rate parameter θ, the probability density of the attack time is:


$$
f(t) =
\begin{cases}
\frac{1}{\theta} e^{-t/\theta} & \text{if } t \ge 0 \\
0 & \text{otherwise}
\end{cases}
$$


Since the exponential distribution is memoryless and the cryptographic key is re-shared in each epoch, we can simply assume that the input to f is the time elapsed since the last re-sharing. Then, given the epoch length τ, the probability of a successful attack is:


$$
F(\tau, \theta) = \int_0^{\tau} f(t)\, dt = 1 - e^{-\tau/\theta}
$$

Finally, assuming that the probability of capturing a share from a single VM is identical to and independent of that for all other VMs, the probability of capturing at least k shares from l defender VMs in an epoch, which we use as the measure of the system's security, is:

$$
\mathrm{Sec}(l, k, \tau, \theta) = \sum_{i=k}^{l} \binom{l}{i} \left(1 - e^{-\tau/\theta}\right)^{i} \left(e^{-\tau/\theta}\right)^{l-i}
$$
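The security measure is a binomial tail probability and is straightforward to evaluate numerically. The following Python sketch computes it directly from the formula. The function names are hypothetical, and the sample call uses τ = 5 seconds and θ = 3600 seconds, the values chosen later in Section 3.2.

```python
from math import comb, expm1


def attack_probability(tau: float, theta: float) -> float:
    """F(tau, theta) = 1 - exp(-tau / theta): one VM's share is captured in an epoch."""
    # -expm1(-x) equals 1 - exp(-x) but stays accurate for very small x.
    return -expm1(-tau / theta)


def sec(l: int, k: int, tau: float, theta: float) -> float:
    """Probability that at least k of the l shares are captured within one epoch."""
    p = attack_probability(tau, theta)
    return sum(comb(l, i) * p**i * (1.0 - p) ** (l - i) for i in range(k, l + 1))


print(f"Sec(2, 2, 5, 3600) = {sec(2, 2, tau=5.0, theta=3600.0):.2e}")
```

For (l, k) = (2, 2), τ = 5, and θ = 3600, the sum has a single term, the square of the per-VM probability, and evaluates to about 1.9 × 10^-6.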

Measuring Cost: Modeling monetary cost in HERMES is rather simple compared to the other two aspects. Assuming that the cloud provider does not charge for inter-VM communication, the total monetary cost is Cost(l) = l·β, where β is the unit cost of running a single VM at the cloud provider. The cost of communication with the client is also neglected, since it is not an additional cost incurred by HERMES.


Measuring Performance: How the expected performance is formalized depends heavily on the application that HERMES serves and on the metrics that the defender considers. For instance, one may value throughput more than latency while running HERMES. Moreover, the effects of changing the parameters (i.e., k, l) in the mail server case study are far different from those of changing the same parameters in the micro-benchmarking experiments. For brevity, we denote the performance of HERMES for given k and l as Perf(l, k) and leave it to the defender to define the characteristics of this function.


Optimization Problem: Given the success rate parameter θ, the unit cost of a VM β, the budget limit Lcost, and the performance limit Lperf, the aim of the optimization problem is to minimize the probability of a successful attack in an epoch while keeping the total monetary cost at or below Lcost and the performance measure at or below Lperf. Formally, the optimization problem is:

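$$
\begin{aligned}
\text{minimize:} \quad & \mathrm{Sec}(l, k, \tau, \theta) \\
\text{subject to:} \quad & \mathrm{Cost}(l) \le L_{cost}, \quad \mathrm{Perf}(l, k) \le L_{perf}, \\
& l \ge k \ge 1, \quad \tau > 0
\end{aligned}
$$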
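Because l is bounded by the budget and k by l, the feasible set is small once τ is fixed, so the problem can be solved by plain enumeration. The Python sketch below does exactly that. It is an illustrative brute-force search, not necessarily the procedure used in [7]. The function `placeholder_latency`, its coefficients, and all sample limits are hypothetical stand-ins for a defender-supplied Perf(l, k).

```python
from math import comb, expm1
from typing import Callable, Optional, Tuple


def sec(l: int, k: int, tau: float, theta: float) -> float:
    """Probability that at least k of the l shares are captured within one epoch."""
    p = -expm1(-tau / theta)  # 1 - exp(-tau / theta)
    return sum(comb(l, i) * p**i * (1.0 - p) ** (l - i) for i in range(k, l + 1))


def best_configuration(
    theta: float,
    tau: float,
    beta: float,
    l_cost: float,
    perf: Callable[[int, int], float],
    l_perf: float,
) -> Optional[Tuple[int, int, float]]:
    """Enumerate every feasible (l, k) and return the one with the lowest Sec."""
    best = None
    max_l = int(l_cost // beta)  # Cost(l) = l * beta must not exceed L_cost
    for l in range(1, max_l + 1):
        for k in range(1, l + 1):  # l >= k >= 1
            if perf(l, k) > l_perf:
                continue
            risk = sec(l, k, tau, theta)
            if best is None or risk < best[2]:
                best = (l, k, risk)
    return best


def placeholder_latency(l: int, k: int) -> float:
    """HYPOTHETICAL latency model; a real one would be fitted to measurements."""
    return 10.0 + 2.0 * l + 3.0 * k


# Hypothetical budget of four VM units and two different latency limits.
print(best_configuration(3600.0, 5.0, 1.0, 4.0, placeholder_latency, 30.0))
print(best_configuration(3600.0, 5.0, 1.0, 4.0, placeholder_latency, 27.0))
```

With the looser latency limit, the search selects l = k = 4. Tightening the limit rules that setup out, and the search falls back to l = k = 3 rather than (4, 3), because adding a VM without raising k gives the adversary more targets for the same threshold.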
3.2. Application of Secure Multi-objective Optimization Framework for Micro Benchmarking

Modeling performance depends heavily on the case study and the intended configuration, so it is challenging to apply the optimization to every single case. Instead, we aimed to optimize HERMES for 100 concurrent clients in the micro-benchmarking scenario, since all experimental results for this configuration are given in our work [7]. For brevity, we further fix the parameters by choosing the re-sharing period as τ = 5 sec and the success rate parameter as θ = 3600. τ = 5 sec is the smallest value that we tested, and it still allows HERMES to complete several computations in each epoch. Furthermore, choosing a small re-sharing period tightens the overall security, since the adversary has to complete the attack in a very short period. The choice of θ = 3600, on the other hand, is motivated by the existing cross-VM attacks, which need hours to capture the cryptographic key. In an exponential distribution, the expected waiting time to observe one success is θ. Since we expect the attack to succeed in an hour, we assign θ = 3600, the number of seconds in an hour. In addition, we check θ = 600 sec to observe how the optimal values and the attack probabilities in an epoch change for a fixed expected latency limit.

In this example, we picked latency as the target performance metric, assuming that the defender aims to serve 100 concurrent clients as fast as possible. The key step in modeling performance is to determine Perf(l, k). To do so, we applied multiple linear regression to our experimental results and obtained a formula that gives the expected latency for given l and k values. Since it is challenging to test every possible formula, and increasing the number of variables may over-fit the training data, we chose a simple formula, Perf(l, k) = c0 + c1·l + c2·k + c3·(l/k), to model the expected latency, where the coefficients c0 = 118, c1 = 18, c2 = 31, and c3 = 7 were learned from existing performance data. To observe the effects of different performance limits Lperf, we calculated optimal HERMES setups for Lperf ∈ [50, 200]. Finally, assuming that the defender uses the cheapest VM instance on Amazon EC2, she pays $0.02 per hour, which is approximately $175 per year. We vary the monetary budget between $350 per year and $2800 per year to check the optimal values.


              θ = 600                    θ = 3600
Lcost/yr      Conf.      Sec()           Conf.      Sec()
$1820         (2,2)      6.8 × 10^-5     (2,2)      1.9 × 10^-6
$3640         (4,3)      2.2 × 10^-6     (4,3)      3.7 × 10^-8
$7280         (8,5)      2.1 × 10^-9     (8,5)      2.8 × 10^-13
$14560        (16,10)    1.1 × 10^-17    (16,10)    2.1 × 10^-25

Table 1: Attack success probabilities for different system parameters


Table 1 shows the results of the optimization procedure for a varying monetary budget and fixed Lperf = 150. The results include the optimal HERMES setup and the probability of a successful attack in one epoch, for both θ = 3600 and θ = 600. We observe that as we increase the monetary budget, HERMES is allowed to run with more VMs, resulting in lower probabilities of success for the adversary. For instance, when the budget is $7280 per year and θ = 3600, HERMES can be configured to run in the (8,5) setup (i.e., the secret key is divided into 8 shares, any 5 of which can jointly decrypt a message), while the adversary has only a 2.8 × 10^-13 chance of capturing the partitioned cryptographic key. Please see [7] for more experimental results.

In  summary,  we  present  HERMES,  a  novel  system  to  protect  cryptographic  keys  in  cloud  VMs. 
The  key  idea  is  to  periodically  partition  a  cryptographic  key  using  additive  or  Shamir  secret 
sharing.  With  two  different  case  studies,  we  show  that  the  overhead  can  be  as  low  as  1%.  With 
such  small  overhead  in  an  average  request,  cryptographic  keys  become  more  leakage-resilient 
against  any  adversary.  Furthermore,  we  model  the  problem  of  finding  optimal  parameters  for  the 
given  monetary  and  performance  constraints,  which  minimizes  the  security  risk.  Using  our 
formal  model,  the  defender  can  calculate  the  probability  of  a  successful  attack,  and  take 
precautions  (e.g.,  increase  the  number  of  VMs,  decrease  epoch  length). 